Classification and Comparative Study of Data Mining Classifiers with Feature Selection on Binomial Data Set
نویسنده
چکیده
This paper describes about the performance analysis of different data mining classifiers before and after feature selection on binomial data set. Three data mining classifiers Logistic Regression, SVM and Neural Network classifiers are considered in this paper for classification. The Congressional Voting Records data set is a binomial data set investigated in this study is taken from UCI machine learning repository. The classification performance of all classifiers is presented by using statistical performance measures like accuracy, specificity and sensitivity. Gain chart and R.O.C (Receiver Operating Characteristics) chart are also used to measure the performances of the classifiers. A comparative study is carried out among the data mining classifiers. Experimental result showed that without feature selection Logistic Regression and SVM classifiers provides 100% accuracy and neural network provides 98.13 % accuracy on test data set. With feature selection SVM classifier provides 100% accuracy. The performance of SVM classifier is found to be the best among all the classifiers with reduced number of features.
منابع مشابه
Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets
Objective(s): This study addresses feature selection for breast cancer diagnosis. The present process uses a wrapper approach using GA-based on feature selection and PS-classifier. The results of experiment show that the proposed model is comparable to the other models on Wisconsin breast cancer datasets. Materials and Methods: To evaluate effectiveness of proposed feature selection method, we ...
متن کاملEnsemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...
متن کاملH-BwoaSvm: A Hybrid Model for Classification and Feature Selection of Mammography Screening Behavior Data
Breast cancer is one of the most common cancer in the world. Early detection of cancers cause significantly reduce in morbidity rate and treatment costs. Mammography is a known effective diagnosis method of breast cancer. A way for mammography screening behavior identification is women's awareness evaluation for participating in mammography screening programs. Todays, intelligence systems could...
متن کاملEvaluation of Classifiers in Software Fault-Proneness Prediction
Reliability of software counts on its fault-prone modules. This means that the less software consists of fault-prone units the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...
متن کاملSupport Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran
Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...
متن کامل